Target Cost of F0 Based on Pol Concatenative Speec

Authors

  • Kei Fujii
  • Hideki Kashioka
  • Nick Campbell
Abstract

This paper proposes a target cost function for F0 based on polynomial regression for use in concatenative speech synthesis. Polynomial regression is used to express the F0 time series as a continuous contour and to remove the effects of microprosody. We conducted a perceptual experiment and confirmed that the proposed function correlates more highly with perceptual scores than the conventionally used cost function does.
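
The abstract does not give the cost formula, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a hypothetical setup: each unit's F0 is a list of per-frame values in Hz with `None` for unvoiced frames, a low-order polynomial is fitted over normalized time, and the cost is taken as the RMS difference between the two fitted curves. The function names, the polynomial degree, and the RMS distance are all illustrative assumptions.

```python
from math import sqrt
from typing import List, Optional, Sequence


def fit_polynomial(times: Sequence[float], values: Sequence[float],
                   degree: int) -> List[float]:
    """Least-squares polynomial fit; returns coefficients, lowest order first."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(t^(i+j)), b[i] = sum(v * t^i).
    a = [[sum(t ** (i + j) for t in times) for j in range(n)] for i in range(n)]
    b = [sum(v * t ** i for t, v in zip(times, values)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs: Sequence[float], t: float) -> float:
    """Evaluate the fitted polynomial at normalized time t."""
    return sum(c * t ** k for k, c in enumerate(coeffs))


def fit_f0(f0: Sequence[Optional[float]], degree: int = 2) -> List[float]:
    """Fit a polynomial to the voiced frames over normalized time [0, 1].

    Unvoiced frames (None) are skipped, so the fitted curve is continuous
    across them and frame-level perturbations are smoothed out.
    """
    last = max(len(f0) - 1, 1)
    voiced = [(i / last, v) for i, v in enumerate(f0) if v is not None]
    if len(voiced) <= degree:
        raise ValueError("need more voiced frames than the polynomial degree")
    times, values = zip(*voiced)
    return fit_polynomial(times, values, degree)


def f0_target_cost(target: Sequence[Optional[float]],
                   candidate: Sequence[Optional[float]],
                   degree: int = 2, n_points: int = 11) -> float:
    """Illustrative cost: RMS difference (Hz) between the two fitted curves."""
    tc, cc = fit_f0(target, degree), fit_f0(candidate, degree)
    diffs = [evaluate(tc, k / (n_points - 1)) - evaluate(cc, k / (n_points - 1))
             for k in range(n_points)]
    return sqrt(sum(d * d for d in diffs) / n_points)


# Hypothetical F0 values in Hz; None marks an unvoiced frame.
target = [120.0, 125.0, None, 133.0, 135.0, 134.0, 130.0]
candidate = [v + 10.0 if v is not None else None for v in target]
print(round(f0_target_cost(target, candidate), 3))
```

Because the comparison is made between fitted curves rather than raw frames, the two units do not need the same number of frames, and an unvoiced gap in one of them does not break the comparison.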

Similar resources

Objective Distance Measures for S Concatenative Speec

In unit selection based concatenative speech systems, join cost, which measures how well two units can be joined together, is one of the main criteria for selecting appropriate units from the inventory. The ideal join cost will measure perceived discontinuity, based on easily measurable spectral properties of the units being joined, in order to ensure smooth and natural-sounding synthetic speec...

Quasi-syllabic and quasi-articula concatenative speec

In this paper we propose methods of speech segmentation and unit characterization which are motivated by prosodic and physiological principles. In particular, we motivate and describe algorithms for unit-database creation on the basis of quasi-syllables and quasi-articulatory-gestures defined and parameterized purely by acoustic measurements. This approach is intended to overcome the burden of ...

Comparing spectral distance measures for join cost optimization in concatenative speech synthesis

In concatenative synthesis the join cost function can be related to the probability of a perceived discontinuity at the join. Therefore it is important that the distance measures in the cost function correlate highly with human perceived discontinuities. In this paper the results of a listening test on joins in two Norwegian long vowels, /A:/ and /e:/, are presented. Five spectral distance measu...

Emotion conversion using F0 segment selection

This paper describes F0 segment selection, a novel syllable-based F0 conversion method, which provides a concatenative framework to search for F0 segments in a modest corpus of emotional speech (∼15 minutes of data). The method is compared with our earlier work on F0 generation using context-sensitive syllable HMMs. Both methods are complemented with a duration conversion module as well as GMM-ba...

Use of pitch pattern improvement in the CHATR speech synthesis system

A corpus-based concatenative speech synthesis system using no signal processing can produce intelligible synthetic speech maintaining original voice characteristics, but it can sometimes be difficult to realize natural prosody. In such a concatenative system, it is very important to select appropriate waveform segments that are naturally close to the target prosody. This paper describes some appr...

Publication date: 2003